Noncentral Chi-Square distribution (ncx2) — squared norms & test power#

The noncentral chi-square distribution generalizes the usual chi-square by allowing a mean shift in the underlying Gaussian components.

A core generative story is:

[ Z \sim \mathcal N(\mu, I_k), \qquad X = \|Z\|_2^2 = \sum_{i=1}^k Z_i^2 \;\sim\; \chi'^2_k(\lambda), ]

where the noncentrality is the squared mean magnitude

[ \lambda = \|\mu\|_2^2. ]

It shows up whenever your statistic is a sum of squares under an alternative hypothesis: power of \(z/t/F/\chi^2\) tests, signal detection, and more.
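This generative story is easy to check numerically. A minimal NumPy sketch (the mean vector and sample size here are arbitrary choices): the empirical mean and variance of \(\|Z\|_2^2\) should match the theoretical values \(k+\lambda\) and \(2(k+2\lambda)\) derived later in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

k = 4
mu = np.array([1.0, -2.0, 0.5, 0.0])  # arbitrary mean vector
lam = float(mu @ mu)                  # noncentrality = squared mean magnitude

# X = ||Z||_2^2 with Z ~ N(mu, I_k)
z = rng.standard_normal((200_000, k)) + mu
x = (z * z).sum(axis=1)

print(x.mean(), k + lam)           # empirical vs theoretical mean
print(x.var(), 2 * (k + 2 * lam))  # empirical vs theoretical variance
```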

What you’ll learn#

  • classification, support, and parameter space \((\nu,\lambda)\)

  • the PDF (with a modified Bessel function) and practical CDF representations

  • moments (mean/variance/skewness/kurtosis), MGF/CF, and entropy notes

  • how \(\nu\) and \(\lambda\) change the shape

  • a NumPy-only sampler via the Poisson–chi-square mixture

  • practical usage via scipy.stats.ncx2 (pdf, cdf, rvs, fit)

import numpy as np
import plotly.graph_objects as go
import os
import plotly.io as pio

from scipy import optimize, special, stats

# Plotly rendering (CKC convention)
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

# Reproducibility
rng = np.random.default_rng(7)

np.set_printoptions(precision=4, suppress=True)

import scipy, plotly
print("NumPy", np.__version__)
print("SciPy", scipy.__version__)
print("Plotly", plotly.__version__)
NumPy 1.26.2
SciPy 1.15.0
Plotly 6.5.2

1) Title & Classification#

  • Name: noncentral chi-square distribution (ncx2)

  • Type: continuous

  • Support: \(x\in[0,\infty)\)

  • Parameter space: degrees of freedom \(\nu>0\), noncentrality \(\lambda\ge 0\)

  • Common notation: \(X\sim\chi'^2_{\nu}(\lambda)\)

  • SciPy parameterization: stats.ncx2(df=ν, nc=λ, loc=0, scale=1) (with optional \(\mathrm{loc}\in\mathbb R\), \(\mathrm{scale}>0\))

2) Intuition & Motivation#

What it models#

  • Energy of a shifted Gaussian vector: \(X=\|Z\|_2^2\) with \(Z\sim\mathcal N(\mu, I)\).

  • Sum of squared signal + noise: \(\sum_i (s_i + \varepsilon_i)^2\) where \(\varepsilon_i\) are Gaussian noise terms.

Typical real-world use cases#

  • Power calculations for tests that reduce to squared normal statistics (e.g. a two-sided \(z\)-test can be written in terms of \(Z^2\)).

  • Signal detection (energy detectors in radar/communications): distribution of observed energy under \(H_1\).

  • Likelihood ratio / score / Wald tests in large-sample settings: many asymptotic test statistics are (noncentral) chi-square.

Relations to other distributions#

  • Setting \(\lambda=0\) recovers the central chi-square: \(\chi'^2_{\nu}(0)=\chi^2_{\nu}\).

  • Poisson mixture: if \(N\sim\mathrm{Poisson}(\lambda/2)\), then [ X\mid N \sim \chi^2_{\nu+2N}. ]

  • If \(Z\sim\mathcal N(\delta,1)\), then \(Z^2\sim\chi'^2_1(\delta^2)\).

  • Ratios lead to noncentral families: if \(X_1\sim\chi'^2_{\nu_1}(\lambda)\) and \(X_2\sim\chi^2_{\nu_2}\) are independent, then [ \frac{X_1/\nu_1}{X_2/\nu_2}\sim F'_{\nu_1,\nu_2}(\lambda). ]

  • Additivity: if \(X_i\sim \chi'^2_{\nu_i}(\lambda_i)\) are independent, then \(\sum_i X_i\sim\chi'^2_{\sum_i \nu_i}\big(\sum_i\lambda_i\big)\).
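Two of these relations are easy to spot-check with SciPy. A quick sketch (sample sizes and parameter values are arbitrary) using Kolmogorov–Smirnov tests against the claimed distributions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Relation: Z ~ N(delta, 1)  =>  Z^2 ~ ncx2(df=1, nc=delta^2)
delta = 1.5
z2 = (rng.standard_normal(50_000) + delta) ** 2
p_square = stats.kstest(z2, stats.ncx2(1, delta**2).cdf).pvalue

# Additivity: ncx2(3, 2) + ncx2(5, 4) ~ ncx2(8, 6) for independent summands
s = (stats.ncx2(3.0, 2.0).rvs(size=50_000, random_state=rng)
     + stats.ncx2(5.0, 4.0).rvs(size=50_000, random_state=rng))
p_add = stats.kstest(s, stats.ncx2(8.0, 6.0).cdf).pvalue

print(p_square, p_add)  # large p-values => samples consistent with the claimed laws
```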

3) Formal Definition#

We write \(X\sim\chi'^2_{\nu}(\lambda)\) with \(\nu>0\) and \(\lambda\ge 0\).

PDF#

For \(x>0\),

[ f(x;\nu,\lambda) = \frac{1}{2}\,\exp\!\left(-\frac{x+\lambda}{2}\right) \left(\frac{x}{\lambda}\right)^{\nu/4-1/2} I_{\nu/2-1}\!\left(\sqrt{\lambda x}\right), ]

where \(I_{\alpha}(\cdot)\) is the modified Bessel function of the first kind.

For \(\lambda=0\) this reduces to the central chi-square density

[ f(x;\nu,0)= \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\,x^{\nu/2-1}e^{-x/2},\qquad x>0. ]

CDF#

A common special-function representation uses the generalized Marcum \(Q\)-function \(Q_m\):

[ F(x;\nu,\lambda)=\mathbb P(X\le x)=1-Q_{\nu/2}\!\left(\sqrt{\lambda},\sqrt{x}\right). ]

A numerically useful series view is a Poisson-weighted mixture of central chi-squares:

[ F(x;\nu,\lambda) = \sum_{j=0}^{\infty} w_j\,F_{\chi^2_{\nu+2j}}(x), \qquad w_j = e^{-\lambda/2}\,\frac{(\lambda/2)^j}{j!}. ]

We’ll use this mixture again for sampling.

def ncx2_logpdf(x: np.ndarray, df: float, nc: float) -> np.ndarray:
    """Numerically stable log-PDF for the *standard* noncentral chi-square.

    Uses exp-scaled modified Bessel I (`scipy.special.ive`) and falls back
    to the central chi-square formula when `nc=0`.

    Notes
    -----
    This is mainly for educational purposes; `scipy.stats.ncx2.logpdf`
    is battle-tested and should be preferred for production.
    """

    x = np.asarray(x, dtype=float)
    df = float(df)
    nc = float(nc)

    if df <= 0:
        raise ValueError("df must be > 0")
    if nc < 0:
        raise ValueError("nc must be >= 0")

    out = np.full_like(x, -np.inf, dtype=float)
    mask = x > 0

    if nc == 0.0:
        xm = x[mask]
        out[mask] = (
            (df / 2 - 1) * np.log(xm)
            - xm / 2
            - (df / 2) * np.log(2.0)
            - special.gammaln(df / 2)
        )
        return out

    xm = x[mask]
    z = np.sqrt(nc * xm)
    v = df / 2 - 1.0

    # log(I_v(z)) via exponentially scaled ive: I_v(z) = exp(z) * ive(v, z) for z>0
    log_iv = np.log(special.ive(v, z)) + z

    out[mask] = (
        -np.log(2.0)
        - 0.5 * (xm + nc)
        + (df / 4 - 0.5) * (np.log(xm) - np.log(nc))
        + log_iv
    )
    return out


def ncx2_pdf(x: np.ndarray, df: float, nc: float) -> np.ndarray:
    return np.exp(ncx2_logpdf(x, df, nc))


def ncx2_cdf_poisson(
    x: np.ndarray,
    df: float,
    nc: float,
    *,
    tol: float = 1e-12,
    max_terms: int = 10_000,
) -> np.ndarray:
    """CDF via the Poisson-mixture representation.

    F(x) = sum_j w_j * chi2_cdf(x; df+2j),   w_j ~ Poisson(nc/2)

    This is convenient and conceptually clear, but may require many terms
    when `nc` is very large.
    """

    x = np.asarray(x, dtype=float)
    df = float(df)
    nc = float(nc)

    if df <= 0:
        raise ValueError("df must be > 0")
    if nc < 0:
        raise ValueError("nc must be >= 0")

    cdf = np.zeros_like(x, dtype=float)
    cdf = np.where(x < 0, 0.0, cdf)

    mask = x >= 0
    if not np.any(mask):
        return cdf

    xm = x[mask]

    if nc == 0.0:
        cdf[mask] = stats.chi2(df).cdf(xm)
        return cdf

    lam = nc / 2.0
    w = np.exp(-lam)
    weight_sum = w

    acc = w * stats.chi2(df).cdf(xm)

    j = 0
    while (1.0 - weight_sum) > tol and j < max_terms:
        j += 1
        w *= lam / j
        weight_sum += w
        acc += w * stats.chi2(df + 2 * j).cdf(xm)

    cdf[mask] = np.minimum(acc, 1.0)
    return cdf


# Quick sanity check: PDF integrates to ~1 for a moderate parameter choice
_df0, _nc0 = 5.0, 6.0
_dist0 = stats.ncx2(_df0, _nc0)

xgrid = np.linspace(1e-6, _dist0.ppf(0.9999), 50_000)
area_scipy = np.trapz(_dist0.pdf(xgrid), xgrid)
area_numpy = np.trapz(ncx2_pdf(xgrid, _df0, _nc0), xgrid)
area_numpy, area_scipy
(0.9998999999954213, 0.999899999995421)

4) Moments & Properties#

For \(X\sim\chi'^2_{\nu}(\lambda)\):

| Quantity | Value |
| --- | --- |
| Mean | \(\mathbb E[X]=\nu+\lambda\) |
| Variance | \(\mathrm{Var}(X)=2(\nu+2\lambda)\) |
| Skewness | \(\gamma_1 = \dfrac{\sqrt{8}\,(\nu+3\lambda)}{(\nu+2\lambda)^{3/2}}\) |
| Excess kurtosis | \(\gamma_2 = \dfrac{12(\nu+4\lambda)}{(\nu+2\lambda)^2}\) |
| MGF | \(M(t)=(1-2t)^{-\nu/2}\exp\!\big(\tfrac{\lambda t}{1-2t}\big)\) for \(t<\tfrac12\) |
| CF | \(\varphi(t)=(1-2it)^{-\nu/2}\exp\!\big(\tfrac{\lambda\,it}{1-2it}\big)\) |

Entropy: unlike the central chi-square, the noncentral case has no simple closed-form expression; it’s typically evaluated numerically (e.g. scipy.stats.ncx2(...).entropy()).
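One way to cross-check the numerical route: evaluate the differential entropy \(H(X)=-\int_0^\infty f(x)\log f(x)\,dx\) by quadrature and compare against SciPy (a sketch; the parameter values are arbitrary).

```python
import numpy as np
from scipy import integrate, stats

df, nc = 5.0, 6.0
dist = stats.ncx2(df, nc)

# H(X) = -integral of f(x) * log f(x) over the support (0, inf)
h_quad, _ = integrate.quad(
    lambda x: -dist.pdf(x) * dist.logpdf(x), 0.0, np.inf, limit=200
)

print(h_quad, dist.entropy())  # the two values should agree to several decimals
```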

def ncx2_moments(df: float, nc: float) -> dict:
    df = float(df)
    nc = float(nc)
    if df <= 0:
        raise ValueError("df must be > 0")
    if nc < 0:
        raise ValueError("nc must be >= 0")

    mean = df + nc
    var = 2.0 * (df + 2.0 * nc)
    skew = np.sqrt(8.0) * (df + 3.0 * nc) / (df + 2.0 * nc) ** 1.5
    ex_kurt = 12.0 * (df + 4.0 * nc) / (df + 2.0 * nc) ** 2

    return {
        "mean": mean,
        "var": var,
        "skew": skew,
        "excess_kurtosis": ex_kurt,
    }


def ncx2_mgf(t: np.ndarray, df: float, nc: float) -> np.ndarray:
    t = np.asarray(t, dtype=float)
    if np.any(t >= 0.5):
        raise ValueError("MGF exists only for t < 1/2")
    df = float(df)
    nc = float(nc)

    denom = 1.0 - 2.0 * t
    return denom ** (-df / 2) * np.exp(nc * t / denom)


def ncx2_cf(t: np.ndarray, df: float, nc: float) -> np.ndarray:
    t = np.asarray(t, dtype=float)
    df = float(df)
    nc = float(nc)

    denom = 1.0 - 2.0j * t
    return denom ** (-df / 2) * np.exp(nc * (1.0j * t) / denom)


df0, nc0 = 5.0, 6.0
m = ncx2_moments(df0, nc0)

# Compare to SciPy's built-in stats
mean_s, var_s, skew_s, exkurt_s = stats.ncx2(df0, nc0).stats(moments="mvsk")
entropy_s = stats.ncx2(df0, nc0).entropy()

m, (mean_s, var_s, skew_s, exkurt_s), entropy_s
({'mean': 11.0,
  'var': 34.0,
  'skew': 0.9281099901829891,
  'excess_kurtosis': 1.2041522491349481},
 (11.0, 34.0, 0.9281099901829891, 1.2041522491349481),
 3.0995600010269304)
# Monte Carlo check of mean/variance and the MGF at a few t
n = 200_000
samples = stats.ncx2(df0, nc0).rvs(size=n, random_state=rng)

mc_mean = float(np.mean(samples))
mc_var = float(np.var(samples))

t_vals = np.array([-0.2, -0.05, 0.05, 0.15])
mgf_mc = np.array([np.mean(np.exp(t * samples)) for t in t_vals])
mgf_th = ncx2_mgf(t_vals, df0, nc0)

(mc_mean, mc_var), (m["mean"], m["var"]), np.c_[t_vals, mgf_mc, mgf_th]
((11.021687398489853, 34.00836025071894),
 (11.0, 34.0),
 array([[-0.2   ,  0.1822,  0.183 ],
        [-0.05  ,  0.5992,  0.5999],
        [ 0.05  ,  1.8182,  1.8162],
        [ 0.15  ,  8.8698,  8.8234]]))

5) Parameter Interpretation#

Meaning of the parameters#

  • Degrees of freedom \(\nu\): roughly “how many squared Gaussian components” contribute. Larger \(\nu\) pushes the mass right and makes the distribution less skewed.

  • Noncentrality \(\lambda\): the squared mean shift in the underlying Gaussian story.

If \(Z\sim\mathcal N(\mu,I_\nu)\) then \(\lambda=\|\mu\|^2\). In hypothesis testing, \(\lambda\) often equals a squared effect size (e.g. \(\delta^2\)) and controls power.

Shape changes#

  • Increasing \(\nu\) (holding \(\lambda\) fixed) tends to make the density more symmetric and moves the mean right.

  • Increasing \(\lambda\) (holding \(\nu\) fixed) shifts mass right and increases dispersion (since \(\mathrm{Var}(X)=2(\nu+2\lambda)\)).

We’ll visualize these effects next.

# PDF shape for different (df, nc) combinations
param_sets = [
    (1.0, 0.0, "df=1, nc=0 (chi-square)"),
    (5.0, 0.0, "df=5, nc=0"),
    (5.0, 3.0, "df=5, nc=3"),
    (5.0, 10.0, "df=5, nc=10"),
    (10.0, 10.0, "df=10, nc=10"),
]

# Choose an x-range that covers all parameter sets reasonably well
x_max = max(stats.ncx2(df, nc).ppf(0.999) for df, nc, _ in param_sets)
x = np.linspace(1e-6, x_max, 800)

fig = go.Figure()
for df, nc, label in param_sets:
    fig.add_trace(go.Scatter(x=x, y=stats.ncx2(df, nc).pdf(x), mode="lines", name=label))

fig.update_layout(
    title="Noncentral chi-square PDFs for several parameter choices",
    xaxis_title="x",
    yaxis_title="density",
    width=950,
    height=520,
)
fig.show()
# Same mean, different (df, nc) => different variance/shape
# mean = df + nc, var = 2(df + 2nc)
mean_target = 12.0
param_same_mean = [
    (12.0, 0.0, "(df=12, nc=0)"),
    (8.0, 4.0, "(df=8, nc=4)"),
    (4.0, 8.0, "(df=4, nc=8)"),
]

x_max = max(stats.ncx2(df, nc).ppf(0.999) for df, nc, _ in param_same_mean)
x = np.linspace(1e-6, x_max, 800)

fig = go.Figure()
for df, nc, label in param_same_mean:
    v = 2 * (df + 2 * nc)
    fig.add_trace(
        go.Scatter(
            x=x,
            y=stats.ncx2(df, nc).pdf(x),
            mode="lines",
            name=f"{label} (var={v:.1f})",
        )
    )

fig.update_layout(
    title=f"Different shapes with the same mean (mean={mean_target:.0f})",
    xaxis_title="x",
    yaxis_title="density",
    width=950,
    height=520,
)
fig.show()

6) Derivations#

Expectation and variance via the MGF#

The MGF of \(X\sim\chi'^2_{\nu}(\lambda)\) is

[ M(t)=(1-2t)^{-\nu/2}\exp\!\left(\frac{\lambda t}{1-2t}\right),\qquad t<\frac12. ]

Then

[ \mathbb E[X]=M'(0)=\nu+\lambda \qquad\text{and}\qquad \mathrm{Var}(X)=M''(0)-M'(0)^2=2(\nu+2\lambda). ]

Expectation and variance via the Poisson mixture#

Using \(X\mid N\sim\chi^2_{\nu+2N}\) with \(N\sim\mathrm{Poisson}(\lambda/2)\):

  • \(\mathbb E[X\mid N]=\nu+2N\) and \(\mathrm{Var}(X\mid N)=2(\nu+2N)\).

  • By the law of total expectation, \(\mathbb E[X]=\nu+2\,\mathbb E[N]=\nu+\lambda\).

  • By the law of total variance, [ \mathrm{Var}(X)=\mathbb E[\mathrm{Var}(X\mid N)] + \mathrm{Var}(\mathbb E[X\mid N]) = 2(\nu+\lambda) + 4\,\mathrm{Var}(N) = 2(\nu+\lambda) + 4\cdot\frac{\lambda}{2} = 2(\nu+2\lambda). ]
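The mixture derivation above can be verified by direct simulation (a minimal sketch; the parameter values and sample size are arbitrary): draw \(N\), then \(X\mid N\), and compare the empirical moments with \(\nu+\lambda\) and \(2(\nu+2\lambda)\).

```python
import numpy as np

rng = np.random.default_rng(2)

df, nc = 4.0, 7.0
n_draws = 400_000

# Step 1: N ~ Poisson(lambda/2); Step 2: X | N ~ chi-square(df + 2N)
N = rng.poisson(nc / 2.0, size=n_draws)
X = rng.chisquare(df + 2 * N)  # the df argument broadcasts elementwise

print(X.mean(), df + nc)           # law of total expectation: nu + lambda
print(X.var(), 2 * (df + 2 * nc))  # law of total variance: 2(nu + 2 lambda)
```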

Likelihood#

For i.i.d. data \(x_1,\dots,x_n\),

[ \ell(\nu,\lambda) = \sum_{i=1}^n \log f(x_i;\nu,\lambda). ]

Unlike the central chi-square (a Gamma family), the noncentral density involves a Bessel function, so there is no closed-form MLE in general; parameters are typically estimated numerically.

def ncx2_loglikelihood(x: np.ndarray, df: float, nc: float) -> float:
    x = np.asarray(x, dtype=float)
    if np.any(x < 0):
        return -np.inf
    return float(np.sum(stats.ncx2(df, nc).logpdf(x)))


# Likelihood surface (fix df, vary nc)
df_true, nc_true = 4.0, 7.0
x_obs = stats.ncx2(df_true, nc_true).rvs(size=4000, random_state=rng)

nc_grid = np.linspace(0.0, 18.0, 200)
ll = np.array([ncx2_loglikelihood(x_obs, df_true, nc) for nc in nc_grid])

res = optimize.minimize_scalar(
    lambda nc: -ncx2_loglikelihood(x_obs, df_true, nc),
    bounds=(0.0, 30.0),
    method="bounded",
)

nc_mle = float(res.x)

fig = go.Figure()
fig.add_trace(go.Scatter(x=nc_grid, y=ll, mode="lines", name="log-likelihood"))
fig.add_vline(x=nc_true, line_dash="dash", line_color="green", annotation_text="true nc")
fig.add_vline(x=nc_mle, line_dash="dot", line_color="red", annotation_text="MLE (df fixed)")

fig.update_layout(
    title="Log-likelihood as a function of noncentrality (df fixed)",
    xaxis_title="nc",
    yaxis_title="log-likelihood",
    width=950,
    height=520,
)
fig.show()

nc_true, nc_mle
(7.0, 6.881234572685431)

7) Sampling & Simulation#

A very useful identity is the Poisson mixture:

[ N\sim\mathrm{Poisson}(\lambda/2),\qquad X\mid N \sim \chi^2_{\nu+2N}. ]

So sampling \(X\sim\chi'^2_{\nu}(\lambda)\) can be done by:

  1. Sample \(N\sim\mathrm{Poisson}(\lambda/2)\).

  2. Sample \(Y\sim\chi^2_{\nu+2N}\).

  3. Return \(X=Y\).

Because a central chi-square is a Gamma,

[ \chi^2_{k}\;\equiv\;\mathrm{Gamma}\!\left(\alpha=\frac{k}{2},\;\theta=2\right), ]

we just need a NumPy-only Gamma sampler. We’ll reuse Marsaglia–Tsang (2000).

def gamma_rvs_numpy(shape: float, size: int, rng: np.random.Generator) -> np.ndarray:
    '''Sample Gamma(shape, scale=1) using NumPy only (Marsaglia-Tsang).'''

    k = float(shape)
    if k <= 0:
        raise ValueError("shape must be > 0")

    # k < 1: boost to k+1 and apply power transform
    if k < 1:
        g = gamma_rvs_numpy(k + 1.0, size, rng)
        u = rng.random(size)
        return g * (u ** (1.0 / k))

    # k >= 1: Marsaglia–Tsang
    d = k - 1.0 / 3.0
    c = 1.0 / np.sqrt(9.0 * d)

    out = np.empty(size, dtype=float)
    filled = 0

    while filled < size:
        n = size - filled
        x = rng.standard_normal(n)
        v = 1.0 + c * x
        v = v * v * v  # (1 + c x)^3
        u = rng.random(n)

        positive = v > 0

        # First (cheap) acceptance
        accept = positive & (u < 1.0 - 0.0331 * (x**4))

        # Second acceptance (log test)
        log_v = np.zeros_like(v)
        log_v[positive] = np.log(v[positive])

        accept2 = positive & (~accept) & (
            np.log(u) < 0.5 * x * x + d * (1.0 - v + log_v)
        )

        accept = accept | accept2
        accepted = d * v[accept]

        take = min(accepted.size, n)
        out[filled : filled + take] = accepted[:take]
        filled += take

    return out


def chisquare_rvs_numpy(df: float, size: int, rng: np.random.Generator) -> np.ndarray:
    '''Sample ChiSquare(df) using Gamma(df/2, scale=2).'''
    return 2.0 * gamma_rvs_numpy(df / 2.0, size, rng)


def ncx2_rvs_numpy(df: float, nc: float, size: int, rng: np.random.Generator) -> np.ndarray:
    '''Sample noncentral chi-square via the Poisson mixture (NumPy only).'''

    df = float(df)
    nc = float(nc)
    if df <= 0:
        raise ValueError("df must be > 0")
    if nc < 0:
        raise ValueError("nc must be >= 0")

    n = rng.poisson(nc / 2.0, size=size)
    out = np.empty(size, dtype=float)

    # Group by unique Poisson counts to avoid per-sample loops
    for n_val in np.unique(n):
        mask = n == n_val
        df_eff = df + 2.0 * float(n_val)
        out[mask] = chisquare_rvs_numpy(df_eff, int(np.sum(mask)), rng)

    return out


# Monte Carlo validation
n = 120_000
samples_numpy = ncx2_rvs_numpy(df0, nc0, n, rng)

(np.mean(samples_numpy), np.var(samples_numpy)), (m["mean"], m["var"])
((10.99531473084796, 34.053812112584495), (11.0, 34.0))
# Compare NumPy-only sampler to SciPy (quick KS test)
dist = stats.ncx2(df0, nc0)
ks_stat, ks_p = stats.kstest(samples_numpy[::10], dist.cdf)  # subsample for speed
ks_stat, ks_p
(0.00580082933823875, 0.8118991628381108)

8) Visualization#

We’ll visualize:

  • the theoretical PDF and CDF

  • Monte Carlo samples from our NumPy-only sampler

# PDF + histogram (Monte Carlo)
dist = stats.ncx2(df0, nc0)
x = np.linspace(1e-6, dist.ppf(0.9995), 800)

fig = go.Figure()
fig.add_trace(
    go.Histogram(
        x=samples_numpy,
        nbinsx=70,
        histnorm="probability density",
        name="Monte Carlo (NumPy-only)",
        opacity=0.55,
    )
)
fig.add_trace(go.Scatter(x=x, y=dist.pdf(x), mode="lines", name="True PDF (SciPy)"))

fig.update_layout(
    title=f"Noncentral chi-square PDF with samples (df={df0}, nc={nc0})",
    xaxis_title="x",
    yaxis_title="density",
    width=950,
    height=520,
)
fig.show()
# CDF: theoretical vs empirical
x = np.linspace(0, dist.ppf(0.9995), 700)

emp_x = np.sort(samples_numpy)
emp_cdf = np.arange(1, emp_x.size + 1) / emp_x.size

fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=dist.cdf(x), mode="lines", name="True CDF (SciPy)"))
fig.add_trace(
    go.Scatter(
        x=emp_x[::300],
        y=emp_cdf[::300],
        mode="markers",
        name="Empirical CDF (subsample)",
    )
)

fig.update_layout(
    title="CDF comparison",
    xaxis_title="x",
    yaxis_title="F(x)",
    width=950,
    height=520,
)
fig.show()

9) SciPy Integration (scipy.stats.ncx2)#

scipy.stats.ncx2 implements the standard distribution (plus loc/scale). Common methods:

  • pdf(x), logpdf(x)

  • cdf(x), sf(x) (often prefer sf for tiny tail probabilities)

  • rvs(size=..., random_state=...)

  • fit(data, ...) for MLE

dist = stats.ncx2(df0, nc0)

x_test = np.array([0.5, 2.0, 8.0])

print('pdf:', dist.pdf(x_test))
print('cdf:', dist.cdf(x_test))
print('sf :', dist.sf(x_test))
print('rvs:', dist.rvs(size=5, random_state=rng))

# Fit (MLE) on synthetic data; fix loc=0, scale=1 to estimate only (df, nc)
data = dist.rvs(size=4000, random_state=rng)
df_hat, nc_hat, loc_hat, scale_hat = stats.ncx2.fit(data, floc=0, fscale=1)

(df0, nc0), (df_hat, nc_hat, loc_hat, scale_hat)
pdf: [0.0024 0.0196 0.0749]
cdf: [0.0005 0.0158 0.3463]
sf : [0.9995 0.9842 0.6537]
rvs: [12.0142 14.2524  4.0148 15.9244  9.1567]
((5.0, 6.0), (5.415276209807526, 5.432624331417813, 0, 1))

10) Statistical Use Cases#

10.1 Hypothesis testing: power of a two-sided \(z\)-test#

If \(\bar X\sim\mathcal N(\mu,\sigma^2/n)\) and you test \(H_0:\mu=0\) with

[ Z = \frac{\sqrt{n}\,\bar X}{\sigma}, ]

then under \(H_1\) with true mean \(\mu=\mu_1\) we have \(Z\sim\mathcal N(\delta,1)\) where \(\delta=\sqrt{n}\,\mu_1/\sigma\). So

[ Z^2\sim\chi'^2_1(\delta^2), ]

and the two-sided rejection region \(|Z|>z_{1-\alpha/2}\) becomes \(Z^2>z_{1-\alpha/2}^2\).

10.2 Bayesian modeling: infer the noncentrality#

In detection problems you may observe an energy-like statistic \(x\) and treat \(\lambda\) (signal strength) as unknown:

[ \lambda \sim p(\lambda),\qquad x\mid\lambda \sim \chi'^2_{\nu}(\lambda). ]

The posterior is proportional to \(p(\lambda)\,f(x;\nu,\lambda)\) (usually computed numerically).

10.3 Generative modeling: squared norm of a Gaussian#

If \(Z\sim\mathcal N(\mu,I_k)\) then \(\|Z\|^2\sim\chi'^2_k(\|\mu\|^2)\). This gives a simple generator for positive “energy” features.

# 10.1 Power curve for a two-sided z-test using ncx2
alpha = 0.05
zcrit = stats.norm.ppf(1 - alpha / 2)
crit = zcrit**2

sigma = 1.0
n = 40

mu_vals = np.linspace(0.0, 0.8, 120)
delta2 = (np.sqrt(n) * mu_vals / sigma) ** 2
power = 1.0 - stats.ncx2(df=1, nc=delta2).cdf(crit)

fig = go.Figure()
fig.add_trace(go.Scatter(x=mu_vals, y=power, mode="lines", name="power"))
fig.update_layout(
    title=f"Power of two-sided z-test (alpha={alpha}, n={n}, sigma={sigma})",
    xaxis_title="true mean |μ1| (effect size)",
    yaxis_title="power",
    yaxis_range=[0, 1],
    width=950,
    height=520,
)
fig.show()

power[:5], crit
(array([0.05  , 0.0502, 0.0508, 0.0519, 0.0533]), 3.8414588206941254)
# 10.2 Bayesian inference on nc (lambda) via a grid posterior
nu = 4.0
x_obs = 14.0

# Prior: Gamma(a, rate=b) on lambda
# (not conjugate here; this is just a reasonable positive prior)
a, b = 2.0, 0.3  # mean a/b ~ 6.67

lam_grid = np.linspace(0.0, 40.0, 900)

log_prior = (a - 1) * np.log(lam_grid + 1e-12) - b * lam_grid  # unnormalized
log_like = stats.ncx2(nu, lam_grid).logpdf(x_obs)
log_post = log_prior + log_like

log_post -= np.max(log_post)
post = np.exp(log_post)
post /= np.trapz(post, lam_grid)

post_mean = float(np.trapz(lam_grid * post, lam_grid))

fig = go.Figure()
fig.add_trace(go.Scatter(x=lam_grid, y=post, mode="lines", name="posterior"))
fig.add_vline(x=post_mean, line_dash="dash", line_color="red", annotation_text="posterior mean")
fig.update_layout(
    title=f"Posterior over nc=λ given x={x_obs} (df={nu})",
    xaxis_title="λ",
    yaxis_title="posterior density",
    width=950,
    height=520,
)
fig.show()

post_mean
7.96872510672175
# 10.3 Generative model: ||N(mu, I)||^2 matches ncx2
k = 3
mu = np.array([1.5, -0.5, 0.75])
lam = float(mu @ mu)

z = rng.standard_normal((80_000, k)) + mu
x = np.sum(z * z, axis=1)

dist = stats.ncx2(k, lam)
xx = np.linspace(1e-6, dist.ppf(0.9995), 800)

fig = go.Figure()
fig.add_trace(
    go.Histogram(
        x=x,
        nbinsx=70,
        histnorm="probability density",
        name="Monte Carlo: ||N(μ,I)||^2",
        opacity=0.55,
    )
)
fig.add_trace(go.Scatter(x=xx, y=dist.pdf(xx), mode="lines", name=f"ncx2(df={k}, nc=||μ||^2={lam:.3f})"))

fig.update_layout(
    title="Squared norm of a shifted Gaussian is noncentral chi-square",
    xaxis_title="x",
    yaxis_title="density",
    width=950,
    height=520,
)
fig.show()

lam
3.0625

11) Pitfalls#

  • Invalid parameters: you must have df > 0 and nc >= 0.

  • Parameterization gotcha: SciPy’s ncx2 also supports loc and scale; most theory assumes loc=0, scale=1.

  • PDF at/near 0: for \(\nu<2\) the density diverges as \(x\to 0^+\), and for other \((\nu,\lambda)\) it can still be very steep near zero; avoid evaluating exactly at x=0 in naive code.

  • Numerical stability: the Bessel term can overflow for large \(\lambda x\); prefer logpdf/ive-based formulas.

  • Tail probabilities: prefer sf over 1-cdf when probabilities are tiny.

  • Poisson-mixture truncation: the series CDF needs many terms when \(\lambda\) is large; production code uses specialized algorithms.
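The tail-probability pitfall is easy to demonstrate (a small sketch; the evaluation point is arbitrary but chosen deep in the right tail, where the true probability is far below machine epsilon):

```python
from scipy import stats

dist = stats.ncx2(5.0, 6.0)
x_tail = 120.0  # far in the right tail

p_sf = dist.sf(x_tail)            # computed directly: tiny but nonzero
p_naive = 1.0 - dist.cdf(x_tail)  # cdf is 1.0 to machine precision, so this cancels to 0

print(p_sf, p_naive)
```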

12) Summary#

  • ncx2 is a continuous distribution on \([0,\infty)\) with parameters \((\nu,\lambda)\).

  • It is the distribution of squared norms of shifted Gaussian vectors and underpins many power calculations.

  • The PDF involves a modified Bessel function; a key computational view is the Poisson mixture of central chi-squares.

  • Mean/variance are simple: \(\nu+\lambda\) and \(2(\nu+2\lambda)\).

  • Sampling is easy with NumPy using the Poisson-mixture + Gamma sampler.